We present a simple but novel hybrid approach to hyperspectral data cube reconstruction from computed tomography imaging spectrometry (CTIS) images that sequentially combines neural networks and the iterative Expectation Maximization (EM) algorithm. We train and test the ability of the method to reconstruct data cubes of $100\times100\times25$ and $100\times100\times100$ voxels, corresponding to 25 and 100 spectral channels, from simulated CTIS images generated by our CTIS simulator. The hybrid approach utilizes the inherent strength of the Convolutional Neural Network (CNN) with regard to noise and its ability to yield consistent reconstructions and make use of the EM algorithm's ability to generalize to spectral images of any object without training. The hybrid approach achieves better performance than both the CNNs and EM alone for seen (included in CNN training) and unseen (excluded from CNN training) cubes for both the 25- and 100-channel cases. For the 25 spectral channels, the improvements from CNN to the hybrid model (CNN + EM) in terms of the mean-squared errors are between 14-26%. For 100 spectral channels, the improvements between 19-40% are attained with the largest improvement of 40% for the unseen data, to which the CNNs are not exposed during the training.
translated by 谷歌翻译
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
translated by 谷歌翻译
This white paper lays out a vision of research and development in the field of artificial intelligence for the next decade (and beyond). Its denouement is a cyber-physical ecosystem of natural and synthetic sense-making, in which humans are integral participants$\unicode{x2014}$what we call ''shared intelligence''. This vision is premised on active inference, a formulation of adaptive behavior that can be read as a physics of intelligence, and which inherits from the physics of self-organization. In this context, we understand intelligence as the capacity to accumulate evidence for a generative model of one's sensed world$\unicode{x2014}$also known as self-evidencing. Formally, this corresponds to maximizing (Bayesian) model evidence, via belief updating over several scales: i.e., inference, learning, and model selection. Operationally, this self-evidencing can be realized via (variational) message passing or belief propagation on a factor graph. Crucially, active inference foregrounds an existential imperative of intelligent systems; namely, curiosity or the resolution of uncertainty. This same imperative underwrites belief sharing in ensembles of agents, in which certain aspects (i.e., factors) of each agent's generative world model provide a common ground or frame of reference. Active inference plays a foundational role in this ecology of belief sharing$\unicode{x2014}$leading to a formal account of collective intelligence that rests on shared narratives and goals. We also consider the kinds of communication protocols that must be developed to enable such an ecosystem of intelligences and motivate the development of a shared hyper-spatial modeling language and transaction protocol, as a first$\unicode{x2014}$and key$\unicode{x2014}$step towards such an ecology.
translated by 谷歌翻译
While the brain connectivity network can inform the understanding and diagnosis of developmental dyslexia, its cause-effect relationships have not yet enough been examined. Employing electroencephalography signals and band-limited white noise stimulus at 4.8 Hz (prosodic-syllabic frequency), we measure the phase Granger causalities among channels to identify differences between dyslexic learners and controls, thereby proposing a method to calculate directional connectivity. As causal relationships run in both directions, we explore three scenarios, namely channels' activity as sources, as sinks, and in total. Our proposed method can be used for both classification and exploratory analysis. In all scenarios, we find confirmation of the established right-lateralized Theta sampling network anomaly, in line with the temporal sampling framework's assumption of oscillatory differences in the Theta and Gamma bands. Further, we show that this anomaly primarily occurs in the causal relationships of channels acting as sinks, where it is significantly more pronounced than when only total activity is observed. In the sink scenario, our classifier obtains 0.84 and 0.88 accuracy and 0.87 and 0.93 AUC for the Theta and Gamma bands, respectively.
translated by 谷歌翻译
There are multiple scales of abstraction from which we can describe the same image, depending on whether we are focusing on fine-grained details or a more global attribute of the image. In brain mapping, learning to automatically parse images to build representations of both small-scale features (e.g., the presence of cells or blood vessels) and global properties of an image (e.g., which brain region the image comes from) is a crucial and open challenge. However, most existing datasets and benchmarks for neuroanatomy consider only a single downstream task at a time. To bridge this gap, we introduce a new dataset, annotations, and multiple downstream tasks that provide diverse ways to readout information about brain structure and architecture from the same image. Our multi-task neuroimaging benchmark (MTNeuro) is built on volumetric, micrometer-resolution X-ray microtomography images spanning a large thalamocortical section of mouse brain, encompassing multiple cortical and subcortical regions. We generated a number of different prediction challenges and evaluated several supervised and self-supervised models for brain-region prediction and pixel-level semantic segmentation of microstructures. Our experiments not only highlight the rich heterogeneity of this dataset, but also provide insights into how self-supervised approaches can be used to learn representations that capture multiple attributes of a single image and perform well on a variety of downstream tasks. Datasets, code, and pre-trained baseline models are provided at: https://mtneuro.github.io/ .
translated by 谷歌翻译
Transformers have recently gained attention in the computer vision domain due to their ability to model long-range dependencies. However, the self-attention mechanism, which is the core part of the Transformer model, usually suffers from quadratic computational complexity with respect to the number of tokens. Many architectures attempt to reduce model complexity by limiting the self-attention mechanism to local regions or by redesigning the tokenization process. In this paper, we propose DAE-Former, a novel method that seeks to provide an alternative perspective by efficiently designing the self-attention mechanism. More specifically, we reformulate the self-attention mechanism to capture both spatial and channel relations across the whole feature dimension while staying computationally efficient. Furthermore, we redesign the skip connection path by including the cross-attention module to ensure the feature reusability and enhance the localization power. Our method outperforms state-of-the-art methods on multi-organ cardiac and skin lesion segmentation datasets without requiring pre-training weights. The code is publicly available at https://github.com/mindflow-institue/DAEFormer.
translated by 谷歌翻译
The ability to convert reciprocating, i.e., alternating, actuation into rotary motion using linkages is hindered fundamentally by their poor torque transmission capability around kinematic singularity configurations. Here, we harness the elastic potential energy of a linear spring attached to the coupler link of four-bar mechanisms to manipulate force transmission around the kinematic singularities. We developed a theoretical model to explore the parameter space for proper force transmission in slider-crank and rocker-crank four-bar kinematics. Finally, we verified the proposed model and methodology by building and testing a macro-scale prototype of a slider-crank mechanism. We expect this approach to enable the development of small-scale rotary engines and robotic devices with closed kinematic chains dealing with serial kinematic singularities, such as linkages and parallel manipulators.
translated by 谷歌翻译
This paper considers a combination of actuation tendons and measurement strings to achieve accurate shape sensing and direct kinematics of continuum robots. Assuming general string routing, a methodical Lie group formulation for the shape sensing of these robots is presented. The shape kinematics is expressed using arc-length-dependent curvature distributions parameterized by modal functions, and the Magnus expansion for Lie group integration is used to express the shape as a product of exponentials. The tendon and string length kinematic constraints are solved for the modal coefficients and the configuration space and body Jacobian are derived. The noise amplification index for the shape reconstruction problem is defined and used for optimizing the string/tendon routing paths, and a planar simulation study shows the minimal number of strings/tendons needed for accurate shape reconstruction. A torsionally stiff continuum segment is used for experimental evaluation, demonstrating mean (maximal) end-effector absolute position error of less than 2% (5%) of total length. Finally, a simulation study of a torsionally compliant segment demonstrates the approach for general deflections and string routings. We believe that the methods of this paper can benefit the design process, sensing and control of continuum and soft robots.
translated by 谷歌翻译
Image classification with small datasets has been an active research area in the recent past. However, as research in this scope is still in its infancy, two key ingredients are missing for ensuring reliable and truthful progress: a systematic and extensive overview of the state of the art, and a common benchmark to allow for objective comparisons between published methods. This article addresses both issues. First, we systematically organize and connect past studies to consolidate a community that is currently fragmented and scattered. Second, we propose a common benchmark that allows for an objective comparison of approaches. It consists of five datasets spanning various domains (e.g., natural images, medical imagery, satellite data) and data types (RGB, grayscale, multispectral). We use this benchmark to re-evaluate the standard cross-entropy baseline and ten existing methods published between 2017 and 2021 at renowned venues. Surprisingly, we find that thorough hyper-parameter tuning on held-out validation data results in a highly competitive baseline and highlights a stunted growth of performance over the years. Indeed, only a single specialized method dating back to 2019 clearly wins our benchmark and outperforms the baseline classifier.
translated by 谷歌翻译
The availability of frequent and cost-free satellite images is in growing demand in the research world. Such satellite constellations as Landsat 8 and Sentinel-2 provide a massive amount of valuable data daily. However, the discrepancy in the sensors' characteristics of these satellites makes it senseless to use a segmentation model trained on either dataset and applied to another, which is why domain adaptation techniques have recently become an active research area in remote sensing. In this paper, an experiment of domain adaptation through style-transferring is conducted using the HRSemI2I model to narrow the sensor discrepancy between Landsat 8 and Sentinel-2. This paper's main contribution is analyzing the expediency of that approach by comparing the results of segmentation using domain-adapted images with those without adaptation. The HRSemI2I model, adjusted to work with 6-band imagery, shows significant intersection-over-union performance improvement for both mean and per class metrics. A second contribution is providing different schemes of generalization between two label schemes - NALCMS 2015 and CORINE. The first scheme is standardization through higher-level land cover classes, and the second is through harmonization validation in the field.
translated by 谷歌翻译